NSF PAR Search | NSF Public Access Repository

Synthetic Audio Helps for Cognitive State Tasks

https://doi.org/10.18653/v1/2025.findings-naacl.92

Soubki, Adil; Murzaku, John; Zeng, Peter; Rambow, Owen (January 2025, Association for Computational Linguistics)

Full Text Available

Multimodal Belief Prediction

https://doi.org/10.21437/Interspeech.2024-2103

Murzaku, John; Soubki, Adil; Rambow, Owen (September 2024, ISCA)

Full Text Available

Examining Gender and Power on Wikipedia through Face and Politeness

https://doi.org/10.18653/v1/2024.sigdial-1.4

Soubki, Adil; Choi, Shyne E; Rambow, Owen (January 2024, Association for Computational Linguistics)

Full Text Available

Training LLMs to Recognize Hedges in Dialogues about Roadrunner Cartoons

https://doi.org/10.18653/v1/2024.sigdial-1.18

Paige, Amie; Soubki, Adil; Murzaku, John; Rambow, Owen; Brennan, Susan E (January 2024, Association for Computational Linguistics)

Hedges allow speakers to mark utterances as provisional, whether to signal non-prototypicality or “fuzziness”, to indicate a lack of commitment to an utterance, to attribute responsibility for a statement to someone else, to invite input from a partner, or to soften critical feedback in the service of face management needs. Here we focus on hedges in an experimentally parameterized corpus of 63 Roadrunner cartoon narratives spontaneously produced from memory by 21 speakers for co-present addressees, transcribed to text (Galati and Brennan, 2010). We created a gold standard of hedges annotated by human coders (the Roadrunner-Hedge corpus) and compared three LLM-based approaches for hedge detection: fine-tuning BERT, and zero-shot and few-shot prompting with GPT-4o and LLaMA-3. The best-performing approach was a fine-tuned BERT model, followed by few-shot GPT-4o. After an error analysis on the top-performing approaches, we used an LLM-in-the-Loop approach to improve the gold standard coding, as well as to highlight cases in which hedges are ambiguous in linguistically interesting ways that will guide future research. This is the first step in our research program to train LLMs to interpret and generate collateral signals appropriately and meaningfully in conversation.
Full Text Available

Views Are My Own, but Also Yours: Benchmarking Theory of Mind Using Common Ground

https://doi.org/10.18653/v1/2024.findings-acl.880

Soubki, Adil; Murzaku, John; Yousefi Jordehi, Arash; Zeng, Peter; Markowska, Magdalena; Mirroshandel, Seyed Abolghasem; Rambow, Owen (January 2024, Association for Computational Linguistics)

Evaluating the theory of mind (ToM) capabilities of language models (LMs) has recently received a great deal of attention. However, many existing benchmarks rely on synthetic data, which risks misaligning the resulting experiments with human behavior. We introduce the first ToM dataset based on naturally occurring spoken dialogs, Common-ToM, and show that LMs struggle to demonstrate ToM. We then show that integrating a simple, explicit representation of beliefs improves LM performance on Common-ToM.
Full Text Available
